PCNN: Projection Convolutional Neural Networks


FIGURE 3.14
We visualize the distribution of the kernel weights of the first convolutional layer of PCNN-22. The variance increases as λ, the ratio that balances the projection loss and the cross-entropy loss, decreases. In particular, when λ = 0 (no projection loss), only one group is obtained, with the kernel weights distributed around 0, which can cause instability during binarization. In contrast, the two Gaussians obtained with the projection loss (λ > 0) are more powerful than the single one without it, yielding better BNNs, as also validated in Table 3.2.
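To make the role of λ concrete, below is a minimal numpy sketch, not the book's exact formulation: the training objective combines the cross-entropy loss with a projection loss weighted by λ, where the hypothetical projection term pulls the full-precision weights toward their binary projection α · sign(w). With λ = 0 that pull vanishes, which is why the weights stay in a single cluster around zero.

```python
import numpy as np

def projection_loss(w, alpha=1.0):
    # Hypothetical projection term: mean squared distance between the
    # full-precision weights and their binary projection alpha * sign(w).
    return float(np.mean((w - alpha * np.sign(w)) ** 2))

def total_loss(ce_loss, w, lam):
    # lam plays the role of lambda in the text: it balances the
    # cross-entropy loss against the projection loss.
    return ce_loss + lam * projection_loss(w)

w = np.array([0.05, -0.03, 0.9, -1.1])
print(total_loss(0.7, w, lam=0.0))    # cross-entropy only, no projection pull
print(total_loss(0.7, w, lam=1e-4))   # slightly larger: projection term added
```

The weight values, α, and the quadratic form of the projection loss here are illustrative assumptions; only the λ-weighted combination of the two losses follows the text.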

curves) converge faster than PCNNs with λ = 0 (yellow curves) once the epoch number exceeds 150.

Diversity visualization In Fig. 3.17, with J = 4, we visualize four channels of the binary kernels D^l_i in the first row, the feature maps produced by D^l_i in the second row, and the corresponding feature maps after binarization in the third row. This illustrates the diversity of kernels and feature maps in PCNNs: multiple projection functions capture diverse information and achieve strong performance even with compressed models.
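The three rows of Fig. 3.17 can be mimicked with a small numpy sketch. As a stand-in for the J = 4 learned projections (which the book obtains via trained projection matrices), each binary kernel D_i here simply applies its own threshold before the sign; the thresholds and toy sizes are assumptions for illustration only.

```python
import numpy as np

def binarize(t):
    # Sign binarization with zeros mapped to +1.
    return np.where(t >= 0, 1.0, -1.0)

def conv2d_valid(x, k):
    # Minimal 'valid' 2-D cross-correlation, sufficient for this sketch.
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))     # one full-precision kernel
x = rng.normal(size=(6, 6))          # a toy input feature map

# Hypothetical stand-in for the J = 4 projections: per-copy thresholds
# make the binary kernels D_i differ from one another.
thresholds = [-0.5, -0.1, 0.1, 0.5]
D = [binarize(kernel - t) for t in thresholds]       # first row of Fig. 3.17
feature_maps = [conv2d_valid(x, d) for d in D]       # second row
binary_maps = [binarize(f) for f in feature_maps]    # third row
```

Because each D_i binarizes the same kernel differently, the resulting feature maps differ as well, which is the diversity the figure illustrates.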

FIGURE 3.15
With λ fixed to 1e4, the variance of the kernel weights decreases from the 2nd epoch to the 200th epoch, confirming that the projection loss does not affect convergence.
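The statistic plotted in Fig. 3.15 is simply the variance of a layer's kernel weights, logged once per epoch. A minimal sketch, with hypothetical weight tensors standing in for early- and late-epoch snapshots:

```python
import numpy as np

def kernel_weight_variance(weights):
    # Flatten the (out, in, kh, kw) kernel tensor and return its variance,
    # i.e. the per-epoch statistic shown in Fig. 3.15.
    return float(np.var(np.asarray(weights).ravel()))

# Assumed toy snapshots: broad weights early in training vs. weights
# tightened around +/-0.5 late in training (values are illustrative).
rng = np.random.default_rng(0)
early = rng.normal(0.0, 1.0, size=(16, 3, 3, 3))
late = 0.5 * np.sign(early) + rng.normal(0.0, 0.05, size=early.shape)
print(kernel_weight_variance(early), kernel_weight_variance(late))
```

The decreasing variance across epochs reported in the caption would show up as the first printed value exceeding the second.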